A Tiling Perspective for Register Optimization

Authors

  • Lukasz Domagala
  • Fabrice Rastello
  • P. Sadayappan
  • Duco van Amstel
Abstract

Register allocation is a much-studied problem. A particularly important context for optimizing register allocation is within loops, since a significant fraction of program execution time is often spent in loop code. A variety of algorithms have been proposed for register allocation, but the complexity of the problem has led to a decoupling of several important aspects, including loop unrolling, register promotion, and instruction reordering. In this paper, we develop an approach to register allocation and promotion in a unified optimization framework that simultaneously considers the impact of loop unrolling and instruction scheduling. This is done via a novel instruction tiling approach in which instructions within a loop are represented along one dimension and innermost loop iterations along the other. By exploiting the regularity along the loop dimension and imposing essential dependence-based constraints on intra-tile execution order, the problem of optimizing register pressure is cast in a constraint programming formalism. Experimental results are provided from thousands of innermost loops extracted from the SPEC benchmarks, demonstrating improvements over the current state of the art.

Keywords: compilation, compiler optimisation, register allocation, register spilling, register promotion, scheduling, constraint programming, loop transformations, loop unrolling, tiling, register tiling, locality

Affiliations (in author order): Inria, Inria, OSU, Inria
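
The abstract's two-dimensional view (instructions along one axis, innermost-loop iterations along the other) can be illustrated with a toy model. The sketch below is not the paper's constraint-programming formulation; it only shows, for a hypothetical three-instruction loop body and a tile of four iterations, how the execution order inside a tile changes the peak number of simultaneously live values. The names `BODY`, `max_live`, `iteration_major`, and `instruction_major` are assumptions introduced for illustration.

```python
# Hypothetical loop body: (name, index of the instruction whose result it consumes).
# Dependences stay inside one iteration: load -> mul -> store.
BODY = [("load", None), ("mul", 0), ("store", 1)]


def iteration_major(tile_iters):
    """Execute each iteration completely before starting the next one."""
    return [(i, k) for k in range(tile_iters) for i in range(len(BODY))]


def instruction_major(tile_iters):
    """Execute one instruction for all iterations of the tile, then the next."""
    return [(i, k) for i in range(len(BODY)) for k in range(tile_iters)]


def max_live(order):
    """Peak number of live values for an order of (instruction, iteration) points."""
    pos = {point: t for t, point in enumerate(order)}
    intervals = []
    for (instr, it), t_def in pos.items():
        # Positions of the consumers of this value (same iteration only).
        uses = [pos[(j, it)] for j, (_, src) in enumerate(BODY) if src == instr]
        if uses:
            # Every consumer must run after its producer.
            assert min(uses) > t_def, "order violates a dependence"
            intervals.append((t_def, max(uses)))
    # A value occupies a register from its definition up to its last use.
    return max(sum(d <= t < u for d, u in intervals) for t in range(len(order)))


if __name__ == "__main__":
    print("iteration-major  :", max_live(iteration_major(4)))
    print("instruction-major:", max_live(instruction_major(4)))
```

In this toy model, the iteration-major order keeps a single value live at a time, while the instruction-major order keeps one value per iteration of the tile live. Both orders respect the same dependences, which is why the choice of intra-tile order matters for register pressure.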

Similar resources

Performance Evaluation of Tiling for the Register Level

Tiling is a well-known loop transformation, which is basically used to expose coarse-grain parallelism and to exploit data reuse at the cache level. However, it can also be used to exploit data reuse at the register level and to improve a program's ILP. Previous work on tiling and also commercial compilers are able to perform tiling for the register level in more than one dimension when the iter...

A Quantitative Algorithm for Data Locality Optimization

In this paper, we consider the problem of optimizing register allocation and cache behavior for loop array references. We exploit techniques developed initially for data locality estimation and improvement in the framework of cache or local memories. First we review the concept of "reference window" that serves as our basic tool for both data locality evaluation and management. Then we study ho...

A Compiler Perspective on Architectural Evolutions

Certain architectural features either constrain or inhibit compiler optimizations. We suggest three hardware changes aimed to improve the situation, from a compiler’s perspective. These changes involve redesigns of translation lookaside buffers, communication in memory hierarchies, and page mapping hardware for caches. Keywords— cache, compiler, optimization, PlayDoh, prefetch, tiling, TLB

Hierarchical tiling for improved superscalar performance

It takes more than a good algorithm to achieve high performance: inner-loop performance and data locality are also important. Tiling is a well-known method for parallelization and for improving data locality. However, tiling has the potential of being even more beneficial. At the finest granularity, it can be used to guide register allocation and instruction scheduling; at the coarsest level, it c...

PrimeTile: A Parametric Multi-Level Tiler for Imperfect Loop Nests

Tiling is a crucial loop transformation for generating high performance code on modern architectures. Efficient generation of multi-level tiled code is essential for maximizing data reuse in systems with deep memory hierarchies. Tiled loops with parametric tile sizes (not compile-time constants) facilitate runtime feedback and dynamic optimizations used in iterative compilation and automatic tu...

Journal:
  • CoRR

Volume: abs/1406.0582

Publication date: 2014